
fix(litellm-amd): re-enable master-key auth + rotate sk-lemonade #1081

Open

yasinBursali wants to merge 1 commit into Light-Heart-Labs:main from yasinBursali:fix/litellm-amd-master-key-and-lemonade-rotation

Conversation

@yasinBursali (Contributor)

What

  • Re-enable LiteLLM master-key auth on AMD installs (was explicitly unset by the AMD overlay's entrypoint)
  • Rotate the static sk-lemonade outbound key (LiteLLM → Lemonade) to a per-install LITELLM_LEMONADE_API_KEY
  • Bundle cross-extension fixes for perplexica and privacy-shield, which previously sent placeholder keys (no-key, not-needed) to LiteLLM and would 401-break on AMD-local once auth was re-enabled

Why

unset LITELLM_MASTER_KEY is unsafe under LAN mode: the overlay's justification ("all LiteLLM ports bind 127.0.0.1") becomes invalid the moment the operator toggles LAN access (BIND_ADDRESS=0.0.0.0). Port 4000 then becomes LAN-accessible with zero auth, and LiteLLM's env receives any cloud-provider keys the user has set — so any LAN peer can route paid completions through the victim's account. Verified live against macOS (auth enforced → 401) vs current AMD (no auth → 200).

Hardcoded sk-lemonade is defense-in-depth tech debt: every AMD install ships the identical static key. Lemonade backend doesn't currently enforce the key, so direct exploit risk is bounded — but rotation prepares the stack for any future Lemonade-side auth and removes a "single key everywhere" anti-pattern.

Bundled extension fixes: perplexica and privacy-shield route through LiteLLM on AMD-local (LLM_API_URL resolves to http://litellm:4000) and previously sent placeholder API keys that LiteLLM ignored when auth was disabled. Re-enabling auth without these compose updates would silently 401-break both extensions on AMD-local. token-spy was investigated and intentionally NOT bundled — source code review (main.py, start.sh) confirmed it's an Anthropic/OpenAI proxy by design and never forwards through LiteLLM; the OLLAMA_URL line in its compose.yaml is dead env.

How

unset removal — bundled with open-webui fix:

  • extensions/services/litellm/compose.amd.yaml: remove unset LITELLM_MASTER_KEY; refresh stale comment block to reflect reality
  • docker-compose.amd.yml:72: change open-webui's OPENAI_API_KEY from the hardcoded no-key to ${LITELLM_KEY}. Without this, fixing the auth alone breaks open-webui on AMD.
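A minimal sketch of the intended open-webui change (the surrounding service definition is assumed, not copied from the repo):

```yaml
# docker-compose.amd.yml (sketch only; surrounding service definition assumed)
services:
  open-webui:
    environment:
      # was: - OPENAI_API_KEY=no-key   (placeholder, 401s once LiteLLM auth is enforced)
      - OPENAI_API_KEY=${LITELLM_KEY}  # installer-generated per-install key
```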

sk-lemonade rotation:

  • .env.schema.json: register LITELLM_LEMONADE_API_KEY (secret, not required)
  • installers/phases/06-directories.sh: generate alongside LITELLM_KEY (always computed for shell scope), emit only on AMD via existing AMD_ENV heredoc
  • installers/windows/lib/env-generator.ps1: same generation pattern, AMD-only
  • bin/dream-host-agent.py: 3 sites read from env with sk-lemonade literal fallback for graceful upgrade
  • scripts/bootstrap-upgrade.sh: read from .env with sk-lemonade fallback
  • config/litellm/lemonade.yaml: keep template literal (overwritten per-install at install time) + clarifying comment
  • tests/test_model_activate.py: parameterize the assertion (regex match on api_key: field, not literal)
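As a hedged sketch of the generation step (the actual helper names in installers/phases/06-directories.sh may differ; only the documented sk-dream-lemonade-&lt;random-hex&gt; shape is taken from this PR):

```shell
# Sketch of the per-install key generation; the function name is hypothetical,
# but the format matches the documented sk-dream-lemonade-<random-hex> shape.
generate_lemonade_key() {
  # 16 random bytes rendered as 32 hex characters
  printf 'sk-dream-lemonade-%s' "$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"
}

LITELLM_LEMONADE_API_KEY="$(generate_lemonade_key)"
```

On AMD installs this value would then be emitted into .env via the existing AMD_ENV heredoc; other platforms never receive the variable.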

Cross-extension bundle:

  • extensions/services/perplexica/compose.yaml: OPENAI_API_KEY changes from ${OPENAI_API_KEY:-no-key} to ${LITELLM_KEY:-${OPENAI_API_KEY:-no-key}}
  • extensions/services/privacy-shield/compose.yaml: TARGET_API_KEY changes from ${TARGET_API_KEY:-not-needed} to ${LITELLM_KEY:-${TARGET_API_KEY:-not-needed}}

The fallback chain prefers the installer-generated LITELLM_KEY (always present), then falls back to a user-set explicit key, then to the original placeholder. On non-LiteLLM-routed paths (default local mode pointing at llama-server) the key is sent but ignored — no regression.
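The tier order can be checked directly: Compose interpolation uses the same :- default semantics as POSIX shell parameter expansion, so a plain-shell sketch (key values made up) shows how the chain resolves:

```shell
# Tier 3: neither variable set, the original placeholder wins
unset LITELLM_KEY OPENAI_API_KEY
echo "${LITELLM_KEY:-${OPENAI_API_KEY:-no-key}}"    # prints: no-key

# Tier 2: a user-set explicit key wins over the placeholder
OPENAI_API_KEY=sk-user-explicit
echo "${LITELLM_KEY:-${OPENAI_API_KEY:-no-key}}"    # prints: sk-user-explicit

# Tier 1: the installer-generated LITELLM_KEY wins over everything
LITELLM_KEY=sk-dream-example
echo "${LITELLM_KEY:-${OPENAI_API_KEY:-no-key}}"    # prints: sk-dream-example
```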

Contract tests:

  • tests/contracts/test-amd-lemonade-contracts.sh: contract #4 inverted (it previously codified the unset LITELLM_MASTER_KEY bug as required state)
  • tests/test-litellm-amd-auth-enforced.sh: 4 new static regression guards; fails if unset LITELLM_MASTER_KEY reappears, if OPENAI_API_KEY=no-key reappears, or if the perplexica/privacy-shield fallback chains are reverted
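A hedged sketch of one such guard (function name, file paths, and messages are illustrative; the real checks live in tests/test-litellm-amd-auth-enforced.sh):

```shell
# Illustrative static guard: fail if the forbidden unset line reappears.
guard_no_unset_master_key() {
  overlay="$1"
  if grep -q 'unset LITELLM_MASTER_KEY' "$overlay"; then
    echo "FAIL: unset LITELLM_MASTER_KEY reappeared in $overlay" >&2
    return 1
  fi
  echo "PASS: $overlay does not unset LITELLM_MASTER_KEY"
}

# Example run against a synthetic entrypoint:
tmp="$(mktemp)"
echo 'exec litellm --config /config.yaml' > "$tmp"
guard_no_unset_master_key "$tmp"
rm -f "$tmp"
```

Being a pure-grep static check, the guard runs in make test without needing a container stack.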

Testing

  • make lint + make test green (17/17 contracts after inversion + 21/21 pytest + 4/4 new static guards)
  • docker compose -f base -f amd -f litellm/compose.yaml -f litellm/compose.amd.yaml config exits 0
  • Negative tests: re-introducing any of the 4 guarded patterns causes the corresponding test to exit 1
  • Live test: ${A:-${B:-default}} nested-default syntax verified with docker compose config against synthetic compose

Manual (recommended):

  • Linux AMD with Lemonade fresh install → .env contains LITELLM_LEMONADE_API_KEY=sk-dream-lemonade-<random-hex> (NOT sk-lemonade); curl http://localhost:4000/v1/models without Bearer key returns 401; open-webui completion request succeeds end-to-end; perplexica + privacy-shield work without 401
  • Linux AMD upgrade path: prior install with sk-lemonade in lemonade.yaml. Run dream-cli upgrade. Verify lemonade.yaml gets new key; verify litellm and open-webui still work.
  • macOS / NVIDIA / CPU installs: completely unaffected — different overlay; ${LITELLM_KEY} first-tier fallback resolves to the same per-install key already in use; behavior unchanged.

Known Considerations

  • Misleading comment in tests/test-litellm-amd-auth-enforced.sh:32-33 ("user-supplied keys still win") — LITELLM_KEY is always installer-generated and non-empty, so it always wins over user-set OPENAI_API_KEY. Cosmetic; behavior is correct.
  • First nested compose default in the repo: the ${VAR:-${OTHER:-default}} syntax is supported by the Compose Specification, but this is its first usage in this codebase. Live-verified safe across platforms.
  • Dead OLLAMA_URL env in token-spy/compose.yaml: never read by source. Separate hygiene-cleanup PR opportunity, not bundled here.

Platform Impact

  • Linux AMD (Lemonade): auth re-enabled, lemonade key rotated, perplexica/privacy-shield work
  • Windows-WSL2 AMD: same as Linux AMD (overlay applies to any gpu_backend=amd)
  • macOS / Linux NVIDIA / Linux CPU: completely unaffected — different overlay, no LITELLM_LEMONADE_API_KEY in their .env; bundled compose changes route through llama-server (not LiteLLM) in default mode → key sent but ignored, no regression

The AMD LiteLLM overlay explicitly ran 'unset LITELLM_MASTER_KEY' before
exec'ing litellm, intentionally disabling auth. The justifying assumption
("All LiteLLM ports bind to 127.0.0.1 — no external exposure") is invalid
once the user enables LAN mode (BIND_ADDRESS=0.0.0.0). Port 4000 then
became LAN-accessible while completely unauthenticated; LiteLLM's env
also receives ANTHROPIC_API_KEY / OPENAI_API_KEY / TOGETHER_API_KEY when
the user has set them, so any LAN peer could route paid completions
through the victim's cloud accounts.

Removes the unset and updates open-webui's hardcoded OPENAI_API_KEY=no-key
to ${LITELLM_KEY} so the AMD inference path keeps working.

Also rotates the hardcoded sk-lemonade outbound key (LiteLLM -> Lemonade
backend) to a per-install LITELLM_LEMONADE_API_KEY (registered in
.env.schema.json, generated alongside LITELLM_KEY in phase 06).
Lemonade itself does not currently verify the key, so the rotation is
defense-in-depth — every install no longer ships an identical static
credential.

Bundles cross-extension fixes for perplexica and privacy-shield, both
of which previously sent placeholder keys to LiteLLM via LLM_API_URL
(perplexica: OPENAI_API_KEY=no-key, privacy-shield: TARGET_API_KEY=
not-needed). Re-enabling LiteLLM auth on AMD-local without these would
401-break both extensions immediately. Each compose.yaml now uses a
LITELLM_KEY-first fallback chain so the installer-generated key
authenticates LiteLLM-routed requests, with the prior placeholders kept
as inert second-tier fallback.

token-spy was investigated and intentionally NOT bundled: source code
review confirmed token-spy is an Anthropic/OpenAI API proxy by design
(reads UPSTREAM_BASE_URL, ANTHROPIC_UPSTREAM, OPENAI_UPSTREAM,
API_PROVIDER); the OLLAMA_URL line in its compose.yaml is dead env
never read by any source file. The 401 risk doesn't apply.

Inverts test-amd-lemonade-contracts.sh contract #4 which previously
codified the bug as required state. Adds tests/test-litellm-amd-auth-
enforced.sh (4 static guards: no unset LITELLM_MASTER_KEY, no
OPENAI_API_KEY=no-key, perplexica fallback chain present,
privacy-shield fallback chain present).
@Lightheartdevs (Collaborator) left a comment

This is the right security direction for AMD/LiteLLM, but I found one blocker in the host-agent rewrite path.

The installer writes the rotated value to .env as LITELLM_LEMONADE_API_KEY, but both new host-agent call sites read os.environ.get("LITELLM_LEMONADE_API_KEY", "sk-lemonade"). The host-agent systemd unit does not load .env as an EnvironmentFile, and dream-host-agent.py parses .env into a local dict rather than exporting it into os.environ. So after a dashboard model activation rewrites config/litellm/lemonade.yaml, it can revert the config back to the legacy static key.

Direct repro on this branch:

.env: LITELLM_LEMONADE_API_KEY=sk-dream-lemonade-from-env
_call: _write_lemonade_config(temp_install, "Model.gguf") with process env unset
output: api_key: sk-lemonade

That undercuts the PR's rotation claim exactly on the post-install model-activation path. Please read the key from load_env(INSTALL_DIR / ".env") / the already-loaded env dict in host-agent paths, and add a focused regression test that proves .env wins when process env is unset.
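The requested precedence (.env value first, legacy literal only as last resort) can be modeled in shell; the actual fix belongs in dream-host-agent.py via its existing load_env dict, so the function below is a behavioral sketch, not the Python implementation:

```shell
# Models the reviewer-requested precedence: the rotated key must come from the
# install's .env file, with the legacy sk-lemonade literal only as a fallback.
# (The real fix is Python-side: read load_env(INSTALL_DIR / ".env") instead of
# os.environ. The helper name here is hypothetical.)
read_lemonade_key() {
  env_file="$1"
  key="$(grep -m1 '^LITELLM_LEMONADE_API_KEY=' "$env_file" 2>/dev/null | cut -d= -f2-)"
  printf '%s' "${key:-sk-lemonade}"
}
```

With .env containing LITELLM_LEMONADE_API_KEY=sk-dream-lemonade-from-env, this returns the rotated value even when the process environment is unset, which is the repro above inverted.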

What passed locally: Bash syntax for the touched shell scripts/tests, PowerShell parse for installers/windows/lib/env-generator.ps1, tests/test-litellm-amd-auth-enforced.sh, tests/contracts/test-amd-lemonade-contracts.sh (17/17), and pytest tests/test_model_activate.py -q (21/21). The live integration-smoke red is the shared _docker_cmd_arr expectation mismatch and appears unrelated to this PR.
